Reliability and Validity

Reliability refers to the degree to which a test is consistent.

Validity refers to the degree to which a test actually measures what it is intended to measure.

Note 1: Reliability is a necessary but not sufficient condition for validity: a reliable measure can be driven by systematic artefact rather than valid signal. Validity, however, is a sufficient condition for reliability: a highly valid measure must be reliable. See the variance components defined below.

Var_t: Variation of the trait of interest in the measurement

Var_c: Variation of the contaminants in the measurement (e.g. systematic noise, unwanted signal)

Var_r: Variation of the random noise
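The three components above suggest the following variance ratios. This is a reconstruction, not necessarily the author's exact formulas, and it assumes the components are uncorrelated so that their variances add up to the total observed variance:

```latex
\[
\text{Reliability} = \frac{\mathrm{Var}_t + \mathrm{Var}_c}{\mathrm{Var}_t + \mathrm{Var}_c + \mathrm{Var}_r},
\qquad
\text{Validity} = \frac{\mathrm{Var}_t}{\mathrm{Var}_t + \mathrm{Var}_c + \mathrm{Var}_r}
\]
```

Under this reading, everything consistent (trait plus contaminants) counts toward reliability, but only the trait counts toward validity. Because Var_c >= 0, validity can never exceed reliability, and the two are equal only when Var_c = 0.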

Specifically, if a test has a reliability of 0.6, its validity can range from 0 to 0.6, depending on how much of the consistent variance actually reflects the trait of interest. In other words, reliability is the upper bound of validity. Conversely, if a test has a validity of 0.6, meaning that the trait of interest accounts for 60% of the variance in the observations, its reliability must be >= 0.6. If a consistent unwanted signal also contaminates the observed score, the consistent signal (both true and unwanted) pushes the reliability above 0.6.
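The numbers in this paragraph can be checked with a minimal sketch. It assumes that reliability is the share of consistent variance (trait plus contaminants) and validity is the share of trait variance in the total observed variance, with uncorrelated components. The function names and the example variances are hypothetical and chosen only for illustration.

```python
def reliability(var_t, var_c, var_r):
    """Share of observed variance that is consistent (trait + contaminants).

    Assumes uncorrelated components, so the total is var_t + var_c + var_r.
    """
    return (var_t + var_c) / (var_t + var_c + var_r)


def validity(var_t, var_c, var_r):
    """Share of observed variance that comes from the trait of interest."""
    return var_t / (var_t + var_c + var_r)


# Reliability fixed at 0.6: validity depends on how the consistent
# variance splits between trait and contaminants.
print(reliability(6, 0, 4), validity(6, 0, 4))  # 0.6 0.6 (no contaminants)
print(reliability(3, 3, 4), validity(3, 3, 4))  # 0.6 0.3 (half contaminants)
print(reliability(0, 6, 4), validity(0, 6, 4))  # 0.6 0.0 (all contaminants)

# Validity fixed at 0.6: contaminants push reliability above 0.6.
print(reliability(6, 2, 2), validity(6, 2, 2))  # 0.8 0.6
```

The first three cases show validity ranging from 0.6 down to 0 while reliability stays at 0.6. The last case shows a validity of 0.6 with a reliability above 0.6.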

Note 2: Validity is specific to the trait of interest. A test can be highly valid for one trait but not for another. For example,

Theoretical relationship between Reliability and Validity (3D)

  • x-axis: contaminants (i.e. consistent unwanted signal)
  • y-axis: errors (i.e. random noise that makes the observation vary randomly)
  • z-axis: reliability/validity

[Figure: two panels, a reliability colormap and a validity colormap]
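The two surfaces over these axes can be tabulated with a short sketch. It assumes a fixed trait variance and the variance-ratio reading of reliability and validity (consistent variance over total, and trait variance over total, with uncorrelated components). The function name `surfaces`, the grid size, and the variance range are hypothetical choices for illustration, not values taken from the original figure.

```python
def reliability(var_t, var_c, var_r):
    # Consistent variance (trait + contaminants) over total variance.
    return (var_t + var_c) / (var_t + var_c + var_r)


def validity(var_t, var_c, var_r):
    # Trait variance over total variance.
    return var_t / (var_t + var_c + var_r)


def surfaces(var_t=1.0, steps=5, max_var=2.0):
    """Evaluate both quantities on a grid.

    Columns (x-axis) vary the contaminant variance, rows (y-axis) vary the
    random-noise variance, and the cell value is the z-axis.
    """
    axis = [max_var * i / (steps - 1) for i in range(steps)]
    rel = [[reliability(var_t, c, r) for c in axis] for r in axis]
    val = [[validity(var_t, c, r) for c in axis] for r in axis]
    return axis, rel, val


axis, rel, val = surfaces()
for name, grid in (("reliability", rel), ("validity", val)):
    print(name, "(rows: noise variance, columns: contaminant variance)")
    for noise, row in zip(axis, grid):
        print(f"  noise={noise:.1f}: " + " ".join(f"{z:.2f}" for z in row))
```

Along the contaminant axis, reliability rises while validity falls. Along the noise axis, both fall. The validity surface never lies above the reliability surface, and the two touch only where the contaminant variance is zero.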